36 research outputs found

Hybridization expansion Monte Carlo simulation of multi-orbital quantum impurity problems: matrix product formalism and improved sampling

Authors: Dolfi Michele, Shinaoka Hiroshi, Troyer Matthias, Werner Philipp
Publication date: 30/06/2014

We explore two complementary modifications of the hybridization-expansion continuous-time Monte Carlo method, aiming at large multi-orbital quantum impurity problems. One idea is to compute the imaginary-time propagation using a matrix product state representation. We show that bond dimensions considerably smaller than the dimension of the Hilbert space are sufficient to obtain accurate results and that this approach scales polynomially, rather than exponentially, with the number of orbitals. Based on scaling analyses, we conclude that a matrix product state implementation will outperform the exact-diagonalization-based method for quantum impurity problems with more than 12 orbitals. The second idea is an improved Monte Carlo sampling scheme which is applicable to all variants of the hybridization expansion method. We show that this so-called sliding window sampling scheme speeds up the simulation by at least an order of magnitude for a broad range of model parameters, with the largest improvements at low temperature.

arXiv.org e-Print Archive

RERO DOC Digital Library

Matrix Product State applications for the ALPS project

Authors: Bauer Bela, Dolfi Michele, Ewart Timothée, Giamarchi Thierry, Kantian Adrian, Keller Sebastian, Kosenkov Alexandr, Troyer Matthias
Publication venue: Elsevier BV
Publication date: 01/01/2014

The density-matrix renormalization group method has become a standard computational approach to the low-energy physics as well as dynamics of low-dimensional quantum systems. In this paper, we present a new set of applications, available as part of the ALPS package, that provide an efficient and flexible implementation of these methods based on a matrix-product state (MPS) representation. Our applications implement, within the same framework, algorithms to variationally find the ground state and low-lying excited states as well as simulate the time evolution of arbitrary one-dimensional and two-dimensional models. Implementing the conservation of quantum numbers for generic Abelian symmetries, we achieve performance competitive with the best codes in the community. Example results are provided for (i) a model of itinerant fermions in one dimension and (ii) a model of quantum magnetism.

Comment: 11+5 pages, 8 figures, 2 examples

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Archive ouverte UNIGE

Corpus Conversion Service: A Machine Learning Platform to Ingest Documents at Scale

Authors: Auer Christoph, Bekas Costas, Dolfi Michele, Staar Peter W J
Publication venue: Association for Computing Machinery (ACM)
Publication date: 24/05/2018

Over the past few decades, the amount of scientific articles and technical literature has increased exponentially in size. Consequently, there is a great need for systems that can ingest these documents at scale and make the contained knowledge discoverable. Unfortunately, both the format of these documents (e.g. the PDF format or bitmap images) and the presentation of the data (e.g. complex tables) make the extraction of qualitative and quantitative data extremely challenging. In this paper, we present a modular, cloud-based platform to ingest documents at scale. This platform, called the Corpus Conversion Service (CCS), implements a pipeline which allows users to parse and annotate documents (i.e. collect ground-truth), train machine-learning classification algorithms and ultimately convert any type of PDF or bitmap document to a structured content representation format. We will show that each of the modules is scalable due to an asynchronous microservice architecture and can therefore handle massive amounts of documents. Furthermore, we will show that our capability to gather ground-truth is accelerated by machine-learning algorithms by at least one order of magnitude. This allows us to both gather large amounts of ground-truth in very little time and obtain very good precision/recall metrics in the range of 99% with regard to content conversion to structured output. The CCS platform is currently deployed on IBM internal infrastructure and serving more than 250 active users for knowledge-engineering project engagements.

Comment: Accepted paper at KDD 2018 conference

arXiv.org e-Print Archive

Crossref

ICDAR 2023 Competition on Robust Layout Segmentation in Corporate Documents

Authors: Auer Christoph, Dolfi Michele, Livathinos Nikolaos, Lysak Maksym, Nassar Ahmed, Staar Peter
Publication date: 24/05/2023

Transforming documents into machine-processable representations is a challenging task due to their complex structures and variability in formats. Recovering the layout structure and content from PDF files or scanned material has remained a key problem for decades. ICDAR has a long tradition of hosting competitions to benchmark the state of the art and encourage the development of novel solutions to document layout understanding. In this report, we present the results of our ICDAR 2023 Competition on Robust Layout Segmentation in Corporate Documents, which posed the challenge to accurately segment the page layout in a broad range of document styles and domains, including corporate reports, technical literature and patents. To raise the bar over previous competitions, we engineered a hard competition dataset and proposed the recent DocLayNet dataset for training. We recorded 45 team registrations and received official submissions from 21 teams. In the presented solutions, we recognize interesting combinations of recent computer vision models, data augmentation strategies and ensemble methods to achieve remarkable accuracy in the task we posed. A clear trend towards adoption of vision-transformer-based methods is evident. The results demonstrate substantial progress towards achieving robust and highly generalizing methods for document layout understanding.

Comment: ICDAR 2023, 10 pages, 4 figures

arXiv.org e-Print Archive